Generalized Mirror Descents in Congestion Games
Authors
Abstract
Different types of dynamics have been studied in repeated game play, and one type that has recently received much attention is based on “no-regret” algorithms from the area of machine learning. It is known that dynamics based on generic no-regret algorithms may not converge to Nash equilibria in general, but only to a larger set of outcomes, namely coarse correlated equilibria. Moreover, convergence results based on generic no-regret algorithms typically use a weaker notion of convergence: the convergence of the average plays instead of the actual plays. Some previous work has shown that with one specific no-regret algorithm, the well-known multiplicative updates algorithm, the actual plays converge to equilibria, and that better outcome quality in terms of the price of anarchy can be reached for atomic congestion games and load balancing games. Are there more cases of natural no-regret dynamics that perform well in suitable classes of games, in terms of both convergence and the quality of the outcomes that the dynamics converge to? We answer this question positively in the bulletin-board model by showing that when players employ the mirror-descent algorithm, a well-known generic no-regret algorithm, the actual plays converge quickly to equilibria in nonatomic congestion games. This gives rise to a family of algorithms, including the multiplicative updates algorithm and the gradient descent algorithm as well as many others. Furthermore, we show that our dynamics achieves good bounds on the outcome quality in terms of price-of-anarchy type measures with two different social costs: the average individual cost and the maximum individual cost. Finally, the bandit model considers an arguably more realistic and prevalent setting with only partial information, in which at each time step each player knows only the cost of her own currently played strategy, but not the costs of any unplayed strategies. For the class of atomic congestion games, we propose a family of bandit algorithms based on the mirror-descent algorithms previously presented, and show that when each player individually adopts such a bandit algorithm, their joint (mixed) strategy profile quickly converges with implications.
∗Part of the results in this paper have appeared in preliminary form in the proceedings of AAMAS 2014 [13] and as an extended abstract in AAMAS 2015 [14].
†Institute of Information Management, National Chiao Tung University, Taiwan. Email: [email protected]. Supported in part by NSC 102-2221-E-009-061-MY2.
‡Institute of Information Science, Academia Sinica, Taiwan. Email: [email protected].
arXiv:1605.07774v1 [cs.GT] 25 May 2016
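The abstract notes that mirror descent yields a family of dynamics that includes both multiplicative updates and gradient descent. The sketch below illustrates that idea on a toy nonatomic congestion game; it is not the paper's algorithm or analysis. Everything specific is an assumption made for illustration: a unit mass of players, three parallel links with affine costs c_e(x) = a_e x + b_e, the step size, the round count, and all function names. For these costs the equal-cost (Wardrop) flow can be checked by hand to be (0.45, 0.40, 0.15), with common cost 0.9.

```python
import math

# Hypothetical toy instance: unit mass of players, three parallel links,
# affine link costs c_e(x) = a_e * x + b_e.
SLOPES = [2.0, 1.0, 1.0]
OFFSETS = [0.0, 0.5, 0.75]


def link_costs(flow, slopes, offsets):
    """Cost of each link under the current flow (the posted costs)."""
    return [a * x + b for a, b, x in zip(slopes, offsets, flow)]


def potential(flow, slopes, offsets):
    """Potential: sum over links of the integral of c_e from 0 to x_e."""
    return sum(0.5 * a * x * x + b * x for a, b, x in zip(slopes, offsets, flow))


def entropic_step(flow, costs, eta):
    """Mirror step with the entropy regularizer (multiplicative updates)."""
    weights = [x * math.exp(-eta * c) for x, c in zip(flow, costs)]
    total = sum(weights)
    return [w / total for w in weights]


def project_to_simplex(point):
    """Euclidean projection onto the probability simplex (sort-based)."""
    ordered = sorted(point, reverse=True)
    cumulative, shift = 0.0, 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        candidate = (cumulative - 1.0) / k
        if value - candidate > 0:
            shift = candidate  # last k that passes the test fixes the shift
    return [max(x - shift, 0.0) for x in point]


def euclidean_step(flow, costs, eta):
    """Mirror step with the squared Euclidean regularizer (projected gradient)."""
    return project_to_simplex([x - eta * c for x, c in zip(flow, costs)])


def run_dynamics(step, slopes, offsets, eta=0.1, rounds=2000):
    """Iterate one mirror-descent variant from the uniform flow.

    Returns the final flow and the potential value after every round.
    """
    n = len(slopes)
    flow = [1.0 / n] * n
    history = [potential(flow, slopes, offsets)]
    for _ in range(rounds):
        costs = link_costs(flow, slopes, offsets)
        flow = step(flow, costs, eta)
        history.append(potential(flow, slopes, offsets))
    return flow, history


if __name__ == "__main__":
    for name, step in [("entropic", entropic_step), ("euclidean", euclidean_step)]:
        final_flow, _ = run_dynamics(step, SLOPES, OFFSETS)
        print(name, [round(x, 4) for x in final_flow])
```

Both variants share the same loop and differ only in the regularizer behind the step, which is the sense in which they belong to one family. In this sketch the actual flow, not a time average, is what gets compared with the equilibrium.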
Similar resources
Generalized Mirror Descents with Non-Convex Potential Functions in Atomic Congestion Games
When playing specific classes of no-regret algorithms (especially, multiplicative updates) in atomic congestion games, some previous convergence analyses were done with a standard Rosenthal potential function in terms of mixed strategy profiles (probability distributions on atomic flows), which may not be convex. In several other works, the convergence analysis was done with a convex potential ...
Generalized mirror descents in congestion games with splittable flows
Different types of dynamics have been studied in repeated game play, and one of them which has received much attention recently consists of those based on “no-regret” algorithms from the area of machine learning. It is known that dynamics based on generic no-regret algorithms may not converge to Nash equilibria in general, but to a larger set of outcomes, namely coarse correlated equilibria. Mo...
Generalized Proof Number Search
We present Generalized Proof Number Search (GPNS), a Proof Number based algorithm able to prove positions in games with multiple outcomes. GPNS is a direct generalization of Proof Number Search (PNS) in the sense that both behave exactly the same way in games with two outcomes. However, GPNS targets a wider class of games. When a game features more than two outcomes, PNS can be used multiple ti...
Maximal elements of $\mathscr{F}_{C,\theta}$-majorized mappings and applications to generalized games
In the paper, some new existence theorems of maximal elements for $\mathscr{F}_{C,\theta}$-mappings and $\mathscr{F}_{C,\theta}$-majorized mappings are established. As applications, some new existence theorems of equilibrium points for one-person games, qualitative games and generalized games are obtained. Our results unify and generalize most known results in recent literature.
Playing Congestion Games with Bandit Feedbacks
Almost all convergence results from each player adopting specific “no-regret” learning algorithms such as multiplicative updates or the more general mirror-descent algorithms in repeated games are only known in the more generous information model, in which each player is assumed to have access to the costs of all possible choices, even the unchosen ones, at each time step. This assumption in ge...
Journal title:
- Artif. Intell.
Volume 241
Publication date 2016